AITopics | initial state distribution

initial state distribution

Multi-agent active perception with prediction rewards

Neural Information Processing Systems, Feb-9-2026, 13:44:45 GMT

This supplementary document is structured as follows.

artificial intelligence, controller state, prediction action, (15 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > Germany > Hamburg (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)

58b286aea34a91a3d33e58af0586fa40-Paper-Conference.pdf

Neural Information Processing Systems, Feb-9-2026, 04:01:19 GMT

algorithm, graph, reinforcement learning, (12 more...)

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Towards Generalization and Simplicity in Continuous Control

Aravind Rajeswaran, Kendall Lowrey, Emanuel V. Todorov, Sham M. Kakade

Neural Information Processing Systems, Nov-21-2025, 12:02:14 GMT

In this backdrop, we ask the pertinent question: "What are the simplest set of ingredients needed

architecture, gradient, perturbation, (16 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)

Reviews: DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections

Neural Information Processing Systems, Nov-18-2025, 22:16:52 GMT

NeurIPS 2019, Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center. "1361" "DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections"

Reviewer 1

Originality: I find this work to be original and the proposed algorithm to be novel. The authors clearly state what their contributions are and how their work differs from prior work.

Clarity/Quality: The paper is clearly written and easy to follow. The authors do a great job stating the problem they consider, explaining existing solutions and their drawbacks, and then thoroughly building up the intuition behind their approach. Each theoretical step makes sense and is intuitive. I also appreciate the authors taking the time to derive their method using a simple convex function and then demonstrating that it is possible to extend the method to a more general set of functions.

artificial intelligence, objective function, stationary distribution correction, (11 more...)

Technology: Information Technology > Artificial Intelligence (1.00)

Autonomous Reinforcement Learning via Subgoal Curricula

Neural Information Processing Systems, Nov-15-2025, 05:57:59 GMT

Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents.

curriculum, initial state distribution, reinforcement, (10 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Instructional Material (0.47)
Research Report (0.46)

Industry: Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Multi-Environment POMDPs: Discrete Model Uncertainty Under Partial Observability

Eline M. Bovy, Caleb Probine, Marnix Suilen, Ufuk Topcu, Nils Jansen

arXiv.org Artificial Intelligence, Oct-29-2025

Multi-environment POMDPs (ME-POMDPs) extend standard POMDPs with discrete model uncertainty. ME-POMDPs represent a finite set of POMDPs that share the same state, action, and observation spaces, but may arbitrarily vary in their transition, observation, and reward models. Such models arise, for instance, when multiple domain experts disagree on how to model a problem. The goal is to find a single policy that is robust against any choice of POMDP within the set, i.e., a policy that maximizes the worst-case reward across all POMDPs. We generalize and expand on existing work in the following way. First, we show that ME-POMDPs can be generalized to POMDPs with sets of initial beliefs, which we call adversarial-belief POMDPs (AB-POMDPs). Second, we show that any arbitrary ME-POMDP can be reduced to a ME-POMDP that only varies in its transition and reward functions or only in its observation and reward functions, while preserving (optimal) policies. We then devise exact and approximate (point-based) algorithms to compute robust policies for AB-POMDPs, and thus ME-POMDPs. We demonstrate that we can compute policies for standard POMDP benchmarks extended to the multi-environment setting.
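
To make the worst-case objective in this abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes full observability (plain MDPs rather than POMDPs), two hypothetical toy models, and brute-force enumeration of deterministic policies. All names (`MODEL_A`, `MODEL_B`, `evaluate`, `worst_case_value`) are illustrative and do not come from the source.

```python
from itertools import product

# Hypothetical toy models, not taken from the paper. Each model maps
# (state, action) -> (reward, next-state distribution). Both models share
# the same two states and two actions but disagree on the rewards.
MODEL_A = {
    (0, 0): (1.0, [1.0, 0.0]),
    (0, 1): (0.4, [1.0, 0.0]),
    (1, 0): (0.0, [0.0, 1.0]),
    (1, 1): (0.0, [0.0, 1.0]),
}
MODEL_B = {
    (0, 0): (0.0, [1.0, 0.0]),
    (0, 1): (0.3, [1.0, 0.0]),
    (1, 0): (0.0, [0.0, 1.0]),
    (1, 1): (0.0, [0.0, 1.0]),
}


def evaluate(model, policy, gamma=0.5, sweeps=200):
    """Iterative policy evaluation: discounted value of each state."""
    values = [0.0] * len(policy)
    for _ in range(sweeps):
        new_values = []
        for state, action in enumerate(policy):
            reward, next_dist = model[(state, action)]
            expected_next = sum(p * v for p, v in zip(next_dist, values))
            new_values.append(reward + gamma * expected_next)
        values = new_values
    return values


def worst_case_value(models, policy, init_dist, gamma=0.5):
    """Minimum over models of the expected value under the initial distribution."""
    return min(
        sum(mu * v for mu, v in zip(init_dist, evaluate(m, policy, gamma)))
        for m in models
    )


models = [MODEL_A, MODEL_B]
init_dist = [1.0, 0.0]  # always start in state 0
policies = list(product(range(2), repeat=2))  # one action per state

# Max-min selection: the policy whose worst model is least bad.
robust_policy = max(policies, key=lambda pi: worst_case_value(models, pi, init_dist))
robust_value = worst_case_value(models, robust_policy, init_dist)
print(robust_policy, round(robust_value, 3))
```

In this toy setup, action 0 in state 0 looks best under `MODEL_A` (value 2.0) but earns nothing under `MODEL_B`, so the max-min choice in state 0 is action 1, whose worst-case value is 0.3 / (1 - 0.5) = 0.6.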

artificial intelligence, machine learning, pomdp, (17 more...)

2510.23744

Country:

Europe (0.92)
North America > United States > Texas (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Government (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Diversify & Conquer: Outcome-directed Curriculum RL via Out-of-Distribution Disagreement

Neural Information Processing Systems, Oct-9-2025, 04:00:05 GMT

D2C requires only a few examples of desired outcomes and works in any environment, regardless of its geometry or the distribution of the desired outcome examples.

curriculum goal, machine learning, reinforcement learning, (17 more...)

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Europe > Italy > Sardinia > Cagliari (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Robots (0.69)

Initial Distribution Sensitivity of Constrained Markov Decision Processes

Alperen Tercan, Necmiye Ozay

arXiv.org Artificial Intelligence, Oct-2-2025

Constrained Markov Decision Processes (CMDPs) are notably more complex to solve than standard MDPs due to the absence of universally optimal policies across all initial state distributions. This necessitates re-solving the CMDP whenever the initial distribution changes. In this work, we analyze how the optimal value of CMDPs varies with different initial distributions, deriving bounds on these variations using duality analysis of CMDPs and perturbation analysis in linear programming. Moreover, we show how such bounds can be used to analyze the regret of a given policy due to unknown variations of the initial distribution.
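
The claim that no single policy is optimal for every initial distribution can be illustrated with a small Python sketch. It is a hedged toy, not the paper's duality or perturbation analysis: it assumes a hypothetical one-step problem (a degenerate CMDP with horizon 1), made-up reward and cost tables, and a search restricted to deterministic policies. The true optimum of a CMDP can be randomized, so the sketch only shows how the best feasible choice shifts with the initial distribution.

```python
from itertools import product

# Hypothetical one-step constrained problem; all numbers are illustrative.
REWARD = [[1.0, 2.0], [1.0, 3.0]]  # REWARD[state][action]
COST = [[0.0, 1.0], [0.0, 1.0]]    # COST[state][action]
BUDGET = 0.5                        # expected cost must not exceed this


def expected(table, policy, init_dist):
    """Expectation of a per-(state, action) quantity under the initial distribution."""
    return sum(mu * table[s][policy[s]] for s, mu in enumerate(init_dist))


def best_feasible_policy(init_dist):
    """Best deterministic policy among those that satisfy the cost budget."""
    feasible = [
        pi for pi in product(range(2), repeat=2)
        if expected(COST, pi, init_dist) <= BUDGET
    ]
    best = max(feasible, key=lambda pi: expected(REWARD, pi, init_dist))
    return best, expected(REWARD, best, init_dist)


for init_dist in ([0.5, 0.5], [0.25, 0.75]):
    print(init_dist, best_feasible_policy(init_dist))
```

With the initial distribution `[0.5, 0.5]`, the costly action is affordable only in state 1, so the best feasible deterministic policy is `(0, 1)`. With `[0.25, 0.75]`, that policy exceeds the budget and the search switches to `(1, 0)`, which is why the problem has to be re-solved when the initial distribution changes.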

artificial intelligence, machine learning, optimization problem, (17 more...)

2510.00348

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology: